DEBUGGER PROGRAM WHICH INCLUDES CORRELATION OF COMPUTER
PROGRAM SOURCE CODE WITH OPTIMIZED OBJECT CODE 

Field of the Invention 

The invention relates in general to the compilation of 
computer program source code to produce object code and in 
particular to a debugger program for use when such 
compilation involves optimization of the object code.

Background of the Invention

In the development of computer software, it is 
necessary to perform a function termed "debugging" which 
involves testing and evaluating the software to find and 
correct errors and improper operation. An effective 
debugger program is necessary for rapid and efficient 
development of software. 

The original coding for a computer program is termed 
the source code. This is the code that is written and 
understood by a programmer. The source code is processed by 
a program termed a compiler to produce an assembly code. 
The assembly code is then further processed by a program 
termed an assembler to produce an object file. Multiple 
object files are linked by a loader program to produce an 
executable program which is termed the object code, which 
comprises binary machine language instructions that can be 
executed directly by a computer. 

In testing software, the object code must be executed 
by a computer in a testing phase so that proper operation of 
the code can be determined and any errors or spurious 
operations can be detected. When a problem is detected at
a particular location in the object code, a correction of 
that error must generally be made in the original source 
code. However, in most cases, it is not readily apparent 
which portion of the object code relates directly back to a 
particular line or construct of the source code. It is 
therefore an important function of the debugger program to 
provide such a relationship. This is termed debugging of 
the source code at a symbolic, source code, level. 

A debugger program crucially relies on its ability to
map from a symbolic construct in the source code to a
point(s) in the object code and to map back from a point in
the object code to a construct(s) in the source code. With
this capability, the debugger program allows a user to:

1. set breakpoints and tracepoints in source
code terms.

2. incrementally step (execute) constructs in
the source or object code.

3. have the debugger program communicate the
current point of execution in the object
code in terms of the source code.

4. have the debugger program determine, given
a point in the program (object code), the
correct lexical context in which to
interpret user defined program symbols.

A major factor of complexity arises in the operation of 
a debugger program when the object code is optimized. 
Optimization is a process performed in certain compilers 
which enhances the speed of operation of the resulting 
compiled code. When a source code program is compiled with 
optimization, the relationship between the source code and
the resulting object code is much more complex than in the
case of compilation without optimization. 

In compiler-debugger systems, without optimization, the 
source code to object code correlation is accomplished by 
partitioning the source code and then relating each 
partition to a single instruction in the object code. For 
example, in most Unix debugger programs, the source code is 
partitioned into lines and each line is related to a single 
instruction in the object code. When there is no 
optimization of the object code, it is sufficient to relate 
a source code construct to a single contiguous set of 
instructions to provide basic debugging functionality. 
Further, the instructions which comprise a source code 
construct will appear in a contiguous section of the object 
code. All instructions which correspond to a particular 
source code construct can be readily determined. 

However, in the presence of optimization for the object 
code, all instructions which correspond to a source code 
construct will not necessarily be contiguous in the object
code. The use of optimization can drastically reorder,
eliminate, replicate, factor, fuse and transform source code
elements. Instructions corresponding to a particular source 
code construct may appear in widely disparate sections of 
the object code. Because of such fragmentation in the 
object code for a source code construct, correlating a 
source code construct to a single instruction is not 
sufficient to enable standard debugging functionality. 

The process of optimization for "C" code is described
in "CONVEX C OPTIMIZATION GUIDE", 2nd Edition, Convex
Press, Richardson, Texas, April, 1991. Operation of FORTRAN
code is described in "CONVEX FORTRAN OPTIMIZATION GUIDE",
3rd Edition, Convex Press, Richardson, Texas, November,
1991.

In view of the desirability to optimize object code, 
there exists a need to relate each source code construct to 
one or more ranges of instructions in the object code. 
There is further a need to provide a system and method for 
communicating this correlation of source code to object code 
to the debugger program. Still further, there is a need to 
provide a system and method for utilizing this information 
to accomplish the functions required in a debugger program.

Summary of the Invention

A selected embodiment of the present invention
comprises a debugger program for use with object code which
is produced by the optimized compiling of source code. A
compiler program provides correlation between machine
instructions in the object code and elements (source units)
of the source code. The source units reflect the syntax of
the source code. The compiler produces a source unit table
which comprises a plurality of source units, each entry of
which includes a unique index, a position identification in
the source code, a context identification of the source unit
in the source code, a linkage to other source units where
such linkage exists and a type identification for the source
unit. The compiler program further produces a source range
table which specifies one or more ranges of the machine
instructions which are associated with each of the source
units. Each element of the source code can be related to
corresponding machine instructions through the source units
and the source range table. The debugger program uses both
the source unit table and source range table.

A further aspect of the present invention is a method
for determining correlation between optimized object code
and corresponding source code. This method includes the
steps of processing the source code to produce a plurality
of source units. Each of the source units includes a unique
index, a position identification in the source code, a
context identification of the source unit in the source
code, a linkage to other source units where such linkage
exists and a type identification for the source unit. In a
further step, the source code is processed to produce
compiler nodes which include both entry nodes and
computation nodes. The processing of the source code
produces a sequence of machine instructions. An annotation
is generated for each of the compiler nodes to identify the
source units related to each of the compiler nodes.
Finally, a source range table is generated which specifies
one or more ranges of the machine instructions which are
associated with each of the source units. Each element of
the source code can be related to corresponding ones of the
machine instructions by use of the source units and the
source range table.

A further aspect of the present invention comprises a 
method of operation of a debugger program in which source 
units are highlighted on a display to indicate that 
corresponding elements of the object file, machine language 
instructions, have been executed or are to be executed by 
the computer. This allows the programmer to visualize the 
execution of the program by the sequential highlighting of 
the units of the source code as the program steps through 
the machine language instructions. 

Brief Description of the Drawings

For a more complete understanding of the present 
invention and the advantages thereof, reference is now made 
to the following description taken in conjunction with the 
accompanying drawings in which: 

Figure 1 is a block diagram illustrating the general
environment for compiling and assembling source code into 
object code and including aspects representing the present 
invention and its relationship to the general environment, 

Figure 2 is a block diagram overview of an optimized 
compilation process with multiple steps of optimization, 

Figure 3 is an illustration of source units in a sample 
FORTRAN routine entitled "munge",

Figure 4 is a schematic illustration of compiler nodes 
which include both entry nodes and computation nodes, 

Figure 5 is an illustration of a node tree with 
compilation node annotations, 

Figure 6 is an illustration of a source unit stack for 
the FORTRAN routine "munge" shown in Figure 3,

Figure 7 is a schematic illustration of the initial 
source unit annotations for all of the nodes within the 
FORTRAN routine "munge" shown in Figure 3, 

Figure 8 is an illustration of basic blocks and 
corresponding annotations for the FORTRAN routine "munge" 
shown in Figure 3,

Figure 9 is an illustration of a prototypical loop, 

Figure 10 is an illustration of a FORTRAN DO loop which 
includes an invariant variable, 

Figure 11 is a table entitled "PRE-CODE MOTION" which 
illustrates source unit annotations before code motion for 
the FORTRAN routine "munge" shown in Figure 3, 

Figure 12 is a table entitled "POST-CODE MOTION" which
illustrates source unit annotations after code motion for 
the FORTRAN routine "munge" shown in Figure 3, 
Figure 13 is an overall illustration of the mapping of
source units in the original source code to the machine 
instructions in the object module, 

Figure 14a is an illustration of a display in which a
particular line of the object code being executed is
highlighted,

Figure 14b is a display highlighting the source
units and the source code of the subroutine which
corresponds to the machine instruction under execution as
shown in Figure 14a,

Figure 15a is an illustration of machine instructions 
with one instruction noted as being in execution, 

Figure 15b is a screen display illustrating a highlight 
of the source unit within the source code that corresponds 
to the particular machine language instruction in execution 
as shown in Figure 15a,

Figure 16 is a flow diagram illustrating the 
operational steps for generating the source unit table and 
source range table of the present invention, and 

Figure 17 is a flow diagram illustrating the use of the 
present invention for debugging a program. 



Detailed Description of the Invention

The overall flow of information and relationships of 
various programs used in compiling-debugging code is 
illustrated in a flow diagram 20 shown in Figure 1.
Briefly, the solid arrows represent the conventional flow of 
information in a traditional compiler-debugger environment. 
The dashed arrows represent the information and data flow 
added to the conventional process in accordance with the 
present invention. 

A source file 22 is produced by a programmer in any one
of many available languages such as "C", FORTRAN or others.
The source file 22 is provided to a compiler 24 which
produces assembly code 26. In one aspect of the present
invention, the compiler 24 further produces a source unit
table 28, which is further defined below and an example of
which is shown in Table 1. For each source file, such as 22,
there is provided a separate source unit table 28. The source
unit table 28 is provided to a debugger program 30.
Likewise, the source file 22 is provided to the debugger
program 30.

The assembly code 26 is provided to an assembler 38
which in turn produces an object file 40. A further aspect 
of the present invention is that the compiler 24 provides a 
set of source range directives, further defined below, as 
indicated by the dashed arrows 25 and 27, to the assembler 
38. The assembler 38 produces a source range table 42, 
further defined below and shown as an example in Table 2, 
which is provided to the debugger program 30. The source 
range table correlates source units (i.e. source code in 
file 22) to instructions in the object file 40. The source 
range directives are entries which go into the source range 
table 42. 
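
The two directions of correlation that the source range table supports can be pictured with a minimal Python sketch. The layout used here (a mapping from source unit index to a list of half-open instruction address ranges), the function names and all addresses are assumptions made for illustration; the actual format of the source range table is the one defined for Table 2.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: source unit index -> list of (start, end) instruction
# address ranges, end exclusive. Optimization may scatter one source unit
# over several non-contiguous ranges.
SourceRangeTable = Dict[int, List[Tuple[int, int]]]

def ranges_for_unit(srt: SourceRangeTable, unit: int) -> List[Tuple[int, int]]:
    """Map a source unit to the instruction ranges associated with it."""
    return srt.get(unit, [])

def units_at(srt: SourceRangeTable, pc: int) -> Set[int]:
    """Map an instruction address back to every source unit covering it."""
    return {unit for unit, ranges in srt.items()
            if any(start <= pc < end for start, end in ranges)}

# Made-up addresses, purely illustrative.
example_srt: SourceRangeTable = {
    0: [(0x100, 0x140)],
    7: [(0x108, 0x110), (0x130, 0x138)],   # one unit, two separate ranges
    9: [(0x10C, 0x110)],
}
```

A debugger built along these lines would call `ranges_for_unit` to place a breakpoint for a source unit and `units_at` to decide which source units to highlight for the current instruction.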

A loader 44 receives one or more of the object files 40 
and links them together to produce an executable file 46. 
The object file 40 and executable file 46 comprise machine
language instructions for direct execution by a computer.
The executable file 46 is also provided to the debugger 
program 30. 

The debugger program 30 receives the source file 22 and
the executable file 46 as well as the source unit table 28 
derived from the source file 22. For each object file 40 
there is also provided a separate source range table 42 to 
the debugger program 30. 

The present invention provides a mechanism and process 
for correlating source code to object code where the object 
code has been subject to one or more optimization routines. 
The source code to object code correlation is a set-based 
mechanism which is used to correlate sets of source units to 
compiler nodes. A source unit is defined herein to be a 
language independent syntactic piece of a program. All of 
the source units from a particular file are grouped together 
into the source unit table (see Table 1). The source unit
table is produced during the compilation process and it is 
provided as a data file to the debugger program 30. 

An example of the source unit table 28 is shown in 
Table 1 below. This is a table for the routine "munge"
shown in Figure 3.

SOURCE UNIT TABLE

SOURCE UNIT INDEX    START POSITION    END POSITION    SRT INDEX

0 Routine            [illegible]       [illegible]     [illegible]

1 Block              [illegible]       [illegible]     [illegible]

2 Loop               [illegible]       [illegible]     [illegible]

3 Statement          6 x 10            6 x 14          [illegible]

4 Expression         6 x 14            [illegible]     [illegible]

5 Expression         6 x 17            [illegible]     [illegible]
TABLE 1

"Compiler node" is a term used in the following
description. This term represents a computation which
exists within a program and is well known and understood in
the art of compiler technology. A further aspect of the
present invention is that the source units are associated
with the compiler nodes to provide a vehicle for tracking
the source to object code correlation through the compiler
optimization and code generation process. Each compiler
node is associated with a set of source units that
correspond to the node. In the process of the present
invention, each node is annotated with a source unit if the
node represents a computation which is used to realize a
component of the source code represented by the source unit.
Examples are described below in reference to Figures 5 and
7. The correspondence between source units and compiler
nodes is tracked through each transformation and
optimization in the compilation process. The tracking
through each transformation is dependent upon the type of
transformation that is performed.

A schematic representation of the compilation process 
as it pertains to generation of the source range table is 
shown in Figure 2. A compiler front end 64 constructs an 
initial node representation of the source program and 
initially annotates each node therein with a set of 
appropriate source units. This process is described in 
further detail below. Local and global optimizations are 
performed at each stage of the compilation process. At each 
optimization stage, the node to source unit correlation must 
be properly made. 

Following the operation of the compiler front end 64, 
there is an optimization process 66 which is in turn 
followed by basic block formation 68. The operation of the 
basic block formation 68 is to make control flow explicit in 
the node representation. Following the basic block 
formation 68 there is another step of optimization 70. This 
is followed by loop formation 72. The operation of loop 
formation adds information about compiler detected loops to 
the node representation. Next, there is a further 
optimization step 74. Following step 74, there are 
sequential operations of loop optimization 76, optimization 
78, code generation 80, optimization 82 and completion at 
the emit block 84. The multiple steps of optimization 
described in Figure 2 are well known in the industry. Such 
compiler optimization is described in, for example, Alfred 
Aho, et al, "Compilers: Principles, Techniques, and Tools", 
Addison Wesley Publishing Company, copyright 1986. 

A fundamental aspect of the present invention is termed 
a "source unit". This is a portion of source code that 
reflects the syntax of the program. Source units are 
language independent, abstracted portions of a program. For 
the process described herein, there are five classes of 
source units. These classes, with an abbreviation for each,
are:

1. routines (r)

2. blocks (b)

3. loops (l)

4. statements (s)

5. expressions (e).

There are five attributes for each source unit. These 
attributes are: 

1. An index used to uniquely identify the 
source unit. 

2. The starting and ending character position
of the source unit in the source file. This 
is a position identification. These source 
positions are used to manipulate the text 
files and to highlight the source unit. 

3. The lexical context or scope in which the 
source unit occurs. This is termed context 
identification. The lexical scope is used 
to interpret identifiers at a particular 
point in the source file. 

4. The index of the parent source unit, if any. 
This defines linkage to any other source 
units. 

5. The class of the source unit described
above.

The index for a source unit is a sequential number that 
identifies the source unit. The index numbers are not 
necessarily in the sequential order of appearance of the 
source units in the source code. An example of such an 
index can be a number, such as 35. 

The starting and ending character positions of the 
source unit are identified in terms of row and column 
numbers. The source code is essentially organized as a 
single matrix with each line corresponding to a row and the
position within a row defined as a column. There may be 
hundreds or even thousands of rows (lines) in the source 
code. A typical column width is 80 characters, although an 
implementation may be 4,096 wide. A source unit may have a 
starting position of, for example, row 37, column 2 and an 
ending position of row 38, column 12. As noted above, this 
position identification information can be used to 
manipulate the text files of the source code and to 
highlight on a display screen the source unit under 
consideration. 
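
As an illustration of how row and column positions could be used to pull a source unit's text out of the source file for highlighting, the following Python sketch treats the file as a list of lines. It assumes 1-based rows and columns and an inclusive end position; these conventions, the function name and the sample lines are assumptions, since the description does not fix them.

```python
from typing import List, Tuple

def extract_unit_text(lines: List[str],
                      start: Tuple[int, int],
                      end: Tuple[int, int]) -> str:
    """Return the text between (row, column) start and end positions.

    Assumes rows and columns are 1-based and the end position is inclusive.
    """
    (r0, c0), (r1, c1) = start, end
    if r0 == r1:
        return lines[r0 - 1][c0 - 1:c1]
    parts = [lines[r0 - 1][c0 - 1:]]     # tail of the first row
    parts.extend(lines[r0:r1 - 1])       # any complete rows in between
    parts.append(lines[r1 - 1][:c1])     # head of the last row
    return "\n".join(parts)

# Made-up source lines; not the actual layout of Figure 3.
source = [
    "      do i = 1, n",
    "        a(i) = i * 3 + x",
    "      end do",
]
expr_text = extract_unit_text(source, (2, 16), (2, 20))
```

With the sample lines above, the span from row 2, column 16 to row 2, column 20 selects the text of the multiplication expression.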

The attribute which is termed the lexical context or 
scope in which the source unit occurs is defined by visible 
identifiers, i.e., variables that are in use and available 
with respect to the corresponding source unit. In FORTRAN,
all of the visible identifiers in a particular subroutine 
are associated with each of the source units within that 
subroutine. Referring to Figure 3 which is a FORTRAN
routine entitled "munge", the identifiers are a, x and i.
This set of identifiers represents the lexical scope and 
context attribute for the source units in the FORTRAN 
routine. But in another language, such as "C", the lexical
scope may be only a subset of the variables within the 
entire routine. The definition of a lexical scope within 
each language is well known in the programming industry and 
is used in conjunction with the present invention as defined 
for the language of the source code being used. Lexical 
scope is defined in detail in the Aho, et al book noted 
above, specifically in section 7.4, pages 411-423. 

The parent source unit attribute is the index number for
the parent source unit of the source unit under consideration.
For the example shown in Figure 3, the parent for each 
source unit is the immediate greater source unit. There is 
only one parent source unit for each source unit. However, 
multiple source units may have the same parent. This is
based on a logical tree structure building up from the more 
basic source units to the more specific ones of the source 
units. As an example, referring to Figure 3, the source 
unit 7s has source unit 6b as the parent. Source unit 14s 
has source unit 6b as the parent. The parent of source unit 
9e is the source unit 7s. 
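
The parent linkage forms a tree that can be walked upward from any source unit. The Python sketch below is illustrative only: the links 7s to 6b, 14s to 6b and 9e to 7s are the ones stated above, while the links above 6b are assumed from the nesting order of the source unit stack in Figure 6. The names `PARENT` and `ancestors` are hypothetical.

```python
from typing import Dict, List, Optional

# Parent links. 7s -> 6b, 14s -> 6b and 9e -> 7s are stated in the text;
# the remaining links are assumed from the nesting drawn in Figure 6.
PARENT: Dict[str, Optional[str]] = {
    "0r": None, "1b": "0r", "2l": "1b", "6b": "2l",
    "7s": "6b", "14s": "6b", "9e": "7s",
}

def ancestors(unit: str) -> List[str]:
    """Follow parent links from a unit up to the root of the tree."""
    chain: List[str] = []
    parent = PARENT[unit]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain
```

Because each source unit has exactly one parent, the walk always terminates at the routine source unit.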

The kind of source unit attribute is the class
identifier noted above. The class identifiers are routines
(r), blocks (b), loops (l), statements (s) and expressions
(e).

The above five attributes are defined for each source
unit.

An example of all of the attributes for a defined 
source unit can be as follows. The source unit index number 
is 35, the starting and ending positions are row 37, column 
25 and row 38, column 12 respectively, the lexical context
is defined by the terms a, i and n, the parent source unit 
has index 24 and the kind (class) of source unit is 
statement (s).
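
The five attributes can be collected into a single record. The following Python sketch is one possible representation, not the format used by the compiler; the type and field names are hypothetical, and the sample values are those of the example source unit just described.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional, Tuple

class UnitClass(Enum):
    ROUTINE = "r"
    BLOCK = "b"
    LOOP = "l"
    STATEMENT = "s"
    EXPRESSION = "e"

@dataclass(frozen=True)
class SourceUnit:
    index: int                  # attribute 1: unique index
    start: Tuple[int, int]      # attribute 2: (row, column) of the start
    end: Tuple[int, int]        #              (row, column) of the end
    scope: FrozenSet[str]       # attribute 3: visible identifiers
    parent: Optional[int]       # attribute 4: index of the parent, if any
    kind: UnitClass             # attribute 5: class of the source unit

    def label(self) -> str:
        """Index followed by the class letter, as used in Figure 3."""
        return f"{self.index}{self.kind.value}"

unit = SourceUnit(index=35, start=(37, 25), end=(38, 12),
                  scope=frozenset({"a", "i", "n"}), parent=24,
                  kind=UnitClass.STATEMENT)
```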

Referring to Figure 3, there is shown a simple FORTRAN 
routine in which the source units have been labeled. Each 
source unit is delimited within a box. Associated with each
source unit is an index number and this number is positioned
at the lower right-hand corner of each box. The index has 
two portions. The first is a sequential numerical index 
series and the second is a letter identification for the 
class of source unit, as defined above. Routine is 
represented by "r", loop is represented by "l", etc. For
example, the "end" source unit 136 has index "18s".

In Figure 3, there is illustrated a FORTRAN subroutine 
identified as "munge (a, x, n)". The source units 
identified by source unit indexes and the corresponding 
reference numeral are as follows: 

The Source Unit Table shown in Table 1 illustrates the
source units for the routine "munge" shown in Figure 3.
This table includes the source unit index and a start and end
position for the source unit within the source code itself.
The source code listing is essentially a matrix of rows and
columns and the starting and ending positions are the
corresponding positions in the numbered rows and numbered
columns. The Source Range Table (SRT) index is a reference
to the entries in the Source Range Table shown in Table 2.

After the source units of the source code have been 
determined, the compilation process begins by defining 
compiler nodes for the source code. There are two types of 
compiler nodes. These are (1) entry nodes and (2)
computation nodes. Entry nodes are used to model basic
blocks, which are described below in further detail. The
entry nodes form a graph which represents the control flow in
the program. An entry node is associated with a directed
acyclic group of computation nodes. Computation nodes are
used to represent computations embodied within a program.

Referring now to Figure 4 there are illustrated entry
nodes 150, 152, 154 and 156 and computation nodes 158, 160, 
162, 164, 166, 168 and 170. 

Figure 4 shows some of the entry nodes for the sample 
FORTRAN routine "munge" illustrated in Figure 3. At entry 
node 154, the computation nodes 158 - 170 represent the 
calculation of:

a(i) = i * 3 + x

Referring to Figure 5, there are illustrated the 
annotations for the computation nodes shown in Figure 4 . 
These annotations are for the expression a(i) = i * 3 + x, 
which is a portion of the subroutine "munge" shown in Figure 
3. For example, node 160 is annotated with source units 1, 
2, 3, 7, 8, 10, 11, and 13. The annotations for the other 
nodes are shown in the Figure. The annotations indicate 
that node 160 is specifically associated with the source 
units which annotate node 160. The annotations track the 
nodes through the compilation process. 

A principal object of the present invention is to
provide correlation between the source code and the object 
code to enhance the process of debugging. This is based on 
a process of annotating compilation nodes with sets of 
source units. A node is annotated with a source unit to
indicate that the computation represented by the node is 
related to the source code represented by the source unit. 
The source unit annotation process consists of a group of 
major phases which follows the overall compilation process. 
These phases include:

1. source unit formation

2. basic block formation

3. loop formation, and

4. source range formation.

SOURCE UNIT FORMATION

The compilation process begins by parsing the source 
code into an abstract syntax tree. The abstract syntax tree
represents the syntax and structure of the program.
Semantic analysis takes the parse tree as input and produces 
a list of trees of nodes that reflect the logical structure 
of the program. It is during semantic analysis that the 
initial correspondence between source units and nodes is 
formed. 

Source units, as defined herein, correspond directly to 
syntactic constructs in a language, but with one significant 
exception. Source units are made in the following manner. 
For each routine, function or subroutine, a routine source 
unit is made. For each expression, an expression source 
unit is made. For each statement, a statement source unit 
is made. For each looping construct, a loop source unit is 
made. Since looping constructs are typically statements in
a language, an additional statement source unit is not made. 
Instead, loop source units are treated as statements by the 
debugger program, when it is appropriate. 

The source units are language independent, that is, 
they are based on common language features. Therefore 
source units can be defined for most languages. 

In "C" programs, a block source unit is made for each
compound statement. In FORTRAN, a block source unit is made
for the body of a routine or the group of statements in the 
then-clause or else-clause of a block-if statement. In both 
FORTRAN and C a block source unit is made for the body of 
each loop. The block source unit for a loop body is the 
noted exception. It is a special source unit that is
synthesized by the compiler and plays a critical part in 
loop level optimizations. 

Semantic analysis is based on a recursive walk of the 
parse tree, forming corresponding computation nodes. See 
Figure 4. At each level of the recursive walk, source units
are formed as directed by the syntax of the program. 

Each source unit is pushed onto a source unit stack. 
Referring now to Figure 6, there is shown a source unit 
stack 180 for the FORTRAN routine "munge" illustrated in 
Figure 3. The source unit indexes 0r, 1b, 2l, 6b, 7s, 9e
and 10e correspond to the respective code elements to the
immediate right of the source unit indexes. See Figure 3.
The order of the source units on the stack reflects the
nesting of source units in the source code. For example,
consider the expression i * 3 in the procedure "munge" (see
Figure 3). The routine source unit, 0r, is placed on the
stack first. Next, before analysis of the body of the
routine, the block source unit, 1b, is placed on the stack.
The source unit stack at the point of analyzing i * 3 is
shown in Figure 6. Source units that are syntactically
nested in the program will appear on top of the source unit
stack. Each source unit is logically stacked over its
immediate parent source unit.
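
How the stack tracks nesting during the recursive walk can be sketched in Python as follows. The toy tree models only the path of source units leading to the expression i * 3, using the indexes listed for Figure 6. The tree encoding, the helper names and the popping of a source unit when its construct has been fully analyzed are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Tree = Tuple[str, list]   # (source unit label, child trees)

def nest(labels: List[str]) -> Tree:
    """Build a single-path tree in which each label encloses the next."""
    tree: Tree = (labels[-1], [])
    for label in reversed(labels[:-1]):
        tree = (label, [tree])
    return tree

def walk(tree: Tree, stack: List[str], snapshots: Dict[str, List[str]]) -> None:
    """Recursive walk that keeps the source unit stack in step with nesting."""
    label, children = tree
    stack.append(label)               # entering the construct: push its unit
    snapshots[label] = list(stack)    # stack contents while analyzing it
    for child in children:
        walk(child, stack, snapshots)
    stack.pop()                       # leaving the construct (assumed)

snapshots: Dict[str, List[str]] = {}
walk(nest(["0r", "1b", "2l", "6b", "7s", "9e", "10e"]), [], snapshots)
```

The snapshot recorded for 10e holds the stack contents at the point of analyzing i * 3, with each source unit sitting directly above its parent.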

The initial node representation of the program is 
created during semantic analysis in the compiler front end 
64. Nodes are either created directly or patterned off of 
existing nodes. New nodes must be annotated with a set of 
source units that indicate the source code which corresponds 
to the node. This can happen in one of several ways: 

(1) The new node represents a computation that
directly corresponds to the current syntactic construct
under analysis. In this case the node is annotated
with the set of all of the source units that are
currently on the source unit stack (see Figure 6).
For example, a * node will be created to model the
multiplication in the expression i*3. This node will
be annotated with all the source units that are
currently on the source unit stack. For the i*3
expression shown on the top of the stack in Figure 6,
the node for this source unit is annotated with the set
of source units {0, 1, 2, 6, 7, 9, 10}. The effect of
this is to associate this node with the source units
for the routine, body of the routine, the loop, etc.
all the way to the multiplication term i * 3.

(2) The new node is an auxiliary that is needed 
to help model some computation. In this case, a source 
unit annotated primary node will already exist. The 
source units from the primary node are used to annotate 
the new auxiliary node. 

(3) The new node results from some optimization. 
In this case, the node will be annotated with a set of 
source units that is a function of the source units on 
the nodes that participated in the optimization. 

(4) The new node results from copying an existing 
node into a new context. In this case, the new node is 
annotated with the union of source units on the node 
and the set of source units on the source unit stack. 
A classic example of this is the nodes in the exit test of
a "while" loop in FORTRAN. The syntactic components of 
the exit test for a loop usually occur outside the body 
of the syntactic construct for the loop. However, the 
exit test must be copied into the body of the loop. In 
order to make sure the new exit test is associated with 
the body of the loop, the new exit test nodes are 
additionally annotated with the source units on the 
source unit stack which will contain the source unit 
for the body of the loop. 
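
The four annotation cases can be summarized in a short Python sketch, with each annotation modeled as a set of source unit indexes. The function names are hypothetical. For case (3) the description states only that the result is a function of the participating nodes' source units; the union used below is a placeholder assumption, not the rule the compiler applies.

```python
from typing import FrozenSet, Iterable, List

Units = FrozenSet[int]

def annotate_direct(stack: List[int]) -> Units:
    """Case 1: the node takes every source unit currently on the stack."""
    return frozenset(stack)

def annotate_auxiliary(primary: Units) -> Units:
    """Case 2: an auxiliary node takes the primary node's source units."""
    return primary

def annotate_optimized(participants: Iterable[Units]) -> Units:
    """Case 3: some function of the participating nodes' source units.

    The function is left open in the text; union is only a placeholder.
    """
    return frozenset().union(*participants)

def annotate_copy(original: Units, stack: List[int]) -> Units:
    """Case 4: a copied node keeps its units and adds the current stack."""
    return original | frozenset(stack)
```

For example, an exit test node originally annotated with {0, 1, 2} and copied while the stack holds 0, 1, 2 and 6 would, under this sketch, end up annotated with {0, 1, 2, 6}, which ties it to the block source unit for the loop body.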

Referring now to Figure 7, there is shown in a 
composite tree 200 the annotations for all of the nodes in 
the body of the loop for the subroutine "munge" listed in 
Figure 3. This includes nodes for the two statements of the 
body as well as additional nodes used to increment the loop 
control variable, I, and nodes to test for the end of the
loop. The list of trees produced by the compiler front end 
64 are linked together. This is indicated by the bold lines 
202, 204, 206 and 208. Convert nodes 210, 212 and 214 have 
been inserted into the trees to explicitly represent 
conversions from integers to reals and from reals to
integers. These convert nodes are annotated with the same 
set of source units that are used to annotate the node being 
converted. The convert nodes are examples of auxiliary 
nodes that are patterned off of an already existing primary 
node. Note that all nodes, in the body of the loop are 
annotated with at least the subset of source units {0, 1, 2, 
6}. Source unit 6 is the block source unit for the body of 
the loop. Only nodes in the body of the loop are annotated 
with this source unit. Nodes outside the body of the loop 
will not be annotated with this source unit. This is an 
important aspect of the source unit annotation process and 
it plays an important role in loop analysis and loop level 
optimizations. 

BASIC BLOCK FORMATION 

Once the list of node trees is created, see Figure 7, 
the compiler 24 translates these trees into a graph of basic 
blocks that reflect the control flow of the program. The 
set of basic blocks forms a graph in which cycles in the 
graph represent loops in the program. 

A basic block is represented by a special type of node 
termed an "entry" node. A basic block has a single control 
flow entry point and a single control flow exit point and is 
used to group nodes together that reside in the same control 
flow path. 

Groups of nodes are then placed at the end of each 
basic block. Basic blocks are annotated with the set of 
source units that are common to all nodes in the basic 
block. This is accomplished by incrementally taking the 
intersection of source units of each node as it is added to 
the basic block. This results in associating the 
intersection of the source unit sets of all nodes in the 
basic block with the entry node for the basic block. These 
source units are used later 
in tracking transformations involving nodes from different 
basic blocks and transformations between basic blocks. 
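
The incremental intersection described above might look like the following minimal sketch. The class and method names are hypothetical, as are the node names and unit numbers; only the set operation itself comes from the text.

```python
class BasicBlock:
    """Hypothetical basic block whose entry-node annotation is the running
    intersection of the source unit sets of the nodes added to it."""

    def __init__(self):
        self.nodes = []
        self.source_units = None  # undefined until the first node is added

    def add_node(self, name, units):
        units = frozenset(units)
        self.nodes.append((name, units))
        if self.source_units is None:
            self.source_units = units
        else:
            # Keep only the units common to every node seen so far.
            self.source_units &= units


loop_body = BasicBlock()
loop_body.add_node("load X", {0, 1, 2, 6, 7})
loop_body.add_node("add", {0, 1, 2, 6, 7, 9})
loop_body.add_node("store A(I)", {0, 1, 2, 6, 9})
print(sorted(loop_body.source_units))
```

Only the units shared by all three nodes survive on the entry node, which is why the loop body block ends up annotated with the loop body source unit.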

With respect to the routine "munge", see Figure 3, four 
(primary) basic blocks have been constructed. See Figure 8. 
These are routine entry block 240, loop setup block 242, 
loop body block 244 and end block 246. The routine entry 
block 240 is for the routine entry or prologue code. Any 
nodes in a routine's prologue are always annotated with just 
the source unit for the routine. As a result, the entry 
node for the prologue will be annotated with only the 
routine source unit. Next, the loop setup block 242 
contains the loop's setup code. The loop source unit and 
any higher source units will be common to all nodes in the 
basic block, so the loop setup block node is annotated with 
the routine, routine body, and loop source units. Common to 
all nodes in the body of the loop is the loop body source 
unit. The final basic block in the routine consists solely 
of the END statement block 246 for the routine. The entry 
node for the end block 246 is annotated with the source unit 
for the END statement (see Figure 3). 

LOOP FORMATION 

At higher optimization levels, such as -O1, -O2 and -O3 
as performed by compilers of Convex Computer Corp., the 
compiler 24 defines loops by detecting cycles in the graph 
of basic blocks. Loops are represented by the compiler by 
placing additional information for an entry node, which 
represents the entry point to the iterative code in the node 
of the loop body. An additional item of information 
associated with a loop is a set of source units for the body 
of the loop. This set of source units is the source code 
that is common to all basic blocks in the loop. This set is 
computed by taking the intersection of the source unit sets 
for all of the entry nodes comprising the loop. This is 
used to track loop-level optimizations. 

Loop-level source unit annotations are used to 
initialize the basic block level source units of synthesized 
entry nodes, and to initialize the loop-level source units of 
synthesized loops. 

A prototypical loop is shown in Figure 9. A loop 260 
consists of a loop preheader 262, loop header 264, loop body 
266, and a loop tail 268. The loop preheader 262 is a basic 
block that contains setup code for the loop. The loop 
header 264 is a basic block that is the single entry point 
into the loop. The loop body 266 includes the loop header 
264 and any other basic blocks in the loop. The loop tail 
268 is the block outside the loop to which control passes 
when the loop 260 is exited. The loop header 264 is special 
in that it is the entry node that is annotated with the loop 
source units. This is in addition to the basic block source 
unit for the loop header 264. In loop nests, the preheader 
of one loop may be the header of another (outer) loop. Not 
all loops have loop tails. The body of the loop may consist 
only of one basic block, the loop header. All loops have a 
loop preheader. 

SOURCE RANGE FORMATION 

Once the code has been optimized, code generation is 
performed. Code generation takes the graph of basic blocks 
and constructs a linear sequence of nodes which correspond 
directly to machine instructions. Each node represents 
instructions which are annotated with the set of source 
units to which the instruction corresponds. Once a node has 
been defined, the process of code generation produces the 
appropriate machine language instructions, which are in a 
binary format. The particular instructions that are 
generated are dependent upon the type of node as well as the 
type of computer which will run the machine instructions. 
Code generation of machine language instructions from nodes, 
as defined herein, is well known in the art of compiler 
technology. 

However, this is not a compact form that can be easily 
used by the debugger program 30. The preferred way to 
communicate source to object code correlations to the 
debugger is through source ranges. 






WO 93/25963 



PCT/US93/05227 



TABLE 2 

(Nodes of the routine "munge" at code generation: one row per 
node, giving its label and opcode together with the source 
unit annotations, one column per source unit. The body of the 
table is not legible in this copy.) 

Source ranges, which include information from the 
source range table 42, see Figure 1, are shown in detail in 
Table 2. This Table illustrates the nodes of the routine 
"munge" at code generation. Each line in Table 2 
corresponds to a node in the compilation graph. See Figure 
7. In the "Source Unit Annotations" section of Table 2, 
each column is associated with a single source unit. The 
source unit annotations are depicted by placing a letter 
indicator in the column reserved for a source unit. The 
letter indicates the type of source unit as described above. 
For example, if source unit 6 is associated with an 
instruction, then the letter "B" will appear in column 6 on 
the line for the instruction. An example is the line 
having label "5 4f5bc0". 

Instead of emitting the annotations for each 
instruction, zero, one or more source ranges are emitted for 
each source unit. This is an integral part of the 
annotation process. A source range specifies a range of 
object code instructions that are associated with a source 
unit. This is done by specifying the starting and ending 
positions of the range of instructions associated with a 
source unit. The range is defined by a start and stop 
program count (PC), which is an incremental number for 
each of the instructions in the object code. Source ranges 
can be visualized by reference to Table 2. Each group of 
vertically contiguous annotations results in a single source 
range. For example, the source unit in column "0" is class 
"R", and this corresponds to the source index "0r" in 
Figures 3 and 6. This source unit (0r) has a range from 
label "l 4f59b8" to label "if 4f5abc". Similarly, the 
source unit 1b in column "1" has a range from label "f 
595764" to label "le 4f824c". There may be multiple source 
ranges associated with a particular source unit. The widely 
disparate scattering of instructions resulting from 
optimization associated with a single source unit can be 
easily accommodated in this fashion. 
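
The way vertically contiguous annotations collapse into source ranges can be illustrated with a short Python sketch. It assumes the instructions arrive as a list of (PC, source unit set) pairs in emission order and that a range end is inclusive; the function name and the sample PCs are hypothetical.

```python
def form_source_ranges(instructions):
    """Collapse per-instruction annotations into (unit, start_pc, end_pc)
    ranges. A range stays open while consecutive instructions carry the unit."""
    open_ranges = {}  # unit -> (start_pc, last_pc_seen)
    ranges = []
    for pc, units in instructions:
        # Close the range of every unit that is absent from this instruction.
        for unit in sorted(set(open_ranges) - set(units)):
            start, last = open_ranges.pop(unit)
            ranges.append((unit, start, last))
        # Extend or open a range for every unit on this instruction.
        for unit in units:
            start = open_ranges[unit][0] if unit in open_ranges else pc
            open_ranges[unit] = (start, pc)
    # Close whatever is still open at the end of the instruction stream.
    for unit, (start, last) in open_ranges.items():
        ranges.append((unit, start, last))
    return sorted(ranges, key=lambda r: (r[1], r[0]))


annotated = [
    (100, {0, 1}),
    (102, {0, 1, 6}),
    (104, {0, 1, 7}),
    (106, {0, 1, 6}),
    (108, {0}),
]
for entry in form_source_ranges(annotated):
    print(entry)
```

Unit 6 appears on two non-adjacent instructions and therefore yields two separate ranges, which is how scattered code for one source unit is accommodated.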

A single machine instruction (Opcode) can map to one or 
more source units. For example, Opcode "CVTS_W__SS" maps to 
the source units in columns 0, 1, 2, 3, 6, 7 and 9. 

At the tail end of code generation, the linear sequence 
of nodes is scanned and the corresponding source ranges are 
formed. This is performed by the compiler 24 in the 
production of the assembly code 26. This is indicated by 
the directive shown by line 27 to the assembler 38, see 
Figure 1. The assembler 38 resolves the object module 
relative positions of the starting and ending instructions 
in the source ranges and emits them in a corresponding 
source unit index to the source range table 42. The source 
range table 42 (Table 2) becomes a data file which is 
provided to the debugger program 30. 

OPTIMIZATION 

The present invention is integral with the optimization 
of code in the compilation process. The source unit sets 
must be tracked at each transformation of a node used to 
effect some optimization. Based on the effect of a 
transformation on the source unit annotations, the 
transformations are divided into the categories of node: 
replication, merging, motion, elimination, expansion, 
replacement, and reordering. Each of these is discussed 
below. 

REPLICATION 

Node replication occurs when a new node is 
patterned directly off of an existing node. In this case, 
the source unit set associated with the existing node is 
simply propagated to the new node. 

MERGING 

Node merging occurs either when a new node is 
synthesized out of multiple existing nodes or when a group 
of existing nodes is merged into a single node. In this 
case, the resulting or surviving node is annotated with the 
union of the source unit sets from the 
merged nodes. Node merging occurs in the optimizations 
comprising assignment substitution, common subexpression 
elimination and redundant use elimination. 
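
As a minimal sketch, replication and merging reduce to a copy and a union of source unit sets. The function names and the unit numbers below are illustrative assumptions, not part of the described compiler.

```python
def replicate(existing_units):
    """Replication: the new node simply inherits the existing node's units."""
    return frozenset(existing_units)


def merge(*unit_sets):
    """Merging: the surviving node carries the union of all merged sets."""
    return frozenset().union(*unit_sets)


# Two occurrences of a common subexpression, each annotated with the
# (hypothetical) expression source unit it originally came from.
first_use = {0, 1, 2, 6, 7}
second_use = {0, 1, 2, 6, 9}

shared = merge(first_use, second_use)
print(sorted(shared))
print(sorted(replicate(shared)))
```

After common subexpression elimination the single surviving node is attributed to both expressions, so either one can be highlighted when its instruction executes.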

MOTION 

Node motion occurs when a node is moved out of one 
basic block into another basic block. In this case, the 
moved node is re-annotated by subtracting out from the node 
the source units of the original basic block of the node and 
adding in the source units of the basic block into which the 
node is placed. Node motion occurs in code motion, hoisting 
and sinking, and partial redundant subexpression elimination 
and is an integral component of a variety of other 
transformations. 
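
The re-annotation performed on a moved node can be written as one set expression. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the unit numbers are chosen so that unit 6 stands for a loop body block source unit.

```python
def move_node(node_units, from_block_units, to_block_units):
    """Motion: subtract the units of the block the node leaves, then add
    the units of the block it is placed into."""
    return (frozenset(node_units) - frozenset(from_block_units)) | frozenset(to_block_units)


# A loop-invariant reference, annotated inside the loop body block.
reference_units = {0, 1, 2, 6, 9}
loop_body_block = {0, 1, 2, 6}  # block the node is moved out of
loop_setup_block = {0, 1, 2}    # block the node is moved into

moved = move_node(reference_units, loop_body_block, loop_setup_block)
print(sorted(moved))
```

The moved node keeps its own expression unit (9 here) but loses unit 6, so it is no longer attributed to the body of the loop.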

ELIMINATION 

Node elimination occurs when nodes are eliminated from 
the graph of nodes. When a node is eliminated it does not 
affect the source unit annotations of other nodes. This is 
a deep property of this process. Node elimination is a by- 
product of many transformations and occurs in dead code 
elimination. 

REPLACEMENT 

Node replacement occurs when an existing node is 
replaced by an equivalent group of newly synthesized nodes. 
Node replacement occurs in constant propagation, folding, 
algebraic and trigonometric simplification, strength 
reduction, and the node expansion that occurs as a part of 
code generation. 



REORDERING 

Node reordering occurs when the data flow or control 
flow ordering among a group of nodes is rearranged. In 
this transformation, no manipulation of the source unit 
annotations need take place. This is a deep property of the 
process. Node reordering occurs in instruction scheduling. 

Table 3 is an illustration of a Source Range Table in 
accordance with the present invention. 



SOURCE RANGE TABLE 

SRT INDEX   SU INDEX   FILE   START   END 
    0           1               68    142 
    1           0               68    142 
    2           2               68    140 
    3           5               68     72 
    4           5               84     88 
    5          14              104    108 
    6           9              104    108 
    7          16              104    108 
    8          13              104    108 
    9          15              104    108 
   10           7              104    108 
   11           6              108    136 
   12           9              108    110 
   13          10              108    110 
   14           7              108    110 
   15          17              110    112 
   16          14              110    112 
   17          15              110    112 
   18           7              112    114 
   19           7              122    126 
   20           9              122    126 
   21          14              126    128 
   22          15              126    128 
   23           7              130    134 
   24          14              136    140 
   25          18              140    142 

TABLE 3 



Table 3 is an example of the table 42 in Figure 1 and 
is used to map between executable source units (source code) 
and corresponding instructions in the executable image 
(object code). An entry in the source range table consists 
of a source unit index, and a start and end instruction 
address in the executable image. There may be multiple 
address ranges for a source unit due to optimization (e.g. 
instruction scheduling which interleaves code for an 
expression). Given an instruction address, the table is 
used to determine which source units are active. This 
mapping is used, for example, to determine which source 
units to highlight at debugger eventpoints. Given a source 
unit, the table is used to determine the source unit's 
address ranges. This mapping is used for stepping and the 
setting/unsetting of breakpoints, eventpoints, and 
tracepoints. The elements of the Table are a source range 
table (srt) index which is an incremental numerical listing 
of the entries in the source range table, a source unit (su) 
index which is the index number for each source unit, a file 
number which relates to the one of multiple object files 
which may comprise the executable file, a start program 
count (spc) address for instructions in the object file and 
an end (program count) number for the last instruction in 
the range of instructions in the object code. 

Referring now to Figure 13, there is illustrated an overall 
view of the mapping of source units in the source code to the 
machine instructions in the object code. This Figure includes 
a listing of source code 300, a source unit table 302, a 
source range table 304 and an object module (object code) 
306. The source code 300 consists of the matrix of the 
original source code. The source unit table 302 includes 
only the source unit index along with the start and stop 
positions of the source unit in the source code 300. The 
source range table 304 includes the source unit index and 
corresponding start and end program count (pc) numbers which 
correspond to the appropriate program count numbers for the 
machine language instructions in the text section of the 
object module 306. 
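
The two lookups that the source range table supports (instruction address to active source units, and source unit to address ranges) can be sketched as follows. The entries are a handful of rows as read from Table 3 with the FILE column omitted; the pairing of the columns, the inclusive treatment of the end address, and all names in the code are assumptions made for illustration.

```python
from collections import namedtuple

SourceRange = namedtuple("SourceRange", "su_index start end")

# A few source range table entries (source unit index, start, end).
SOURCE_RANGES = [
    SourceRange(1, 68, 142),
    SourceRange(0, 68, 142),
    SourceRange(2, 68, 140),
    SourceRange(5, 68, 72),
    SourceRange(5, 84, 88),
    SourceRange(6, 108, 136),
    SourceRange(7, 112, 114),
    SourceRange(7, 122, 126),
]


def active_source_units(table, address):
    """Address -> source units whose ranges contain it (end assumed inclusive)."""
    return sorted({r.su_index for r in table if r.start <= address <= r.end})


def ranges_for_source_unit(table, su_index):
    """Source unit -> all of its address ranges, in table order."""
    return [(r.start, r.end) for r in table if r.su_index == su_index]


print(active_source_units(SOURCE_RANGES, 124))
print(ranges_for_source_unit(SOURCE_RANGES, 5))
```

The first lookup is the one a debugger would use to decide what to highlight at an eventpoint; the second gives the addresses at which a breakpoint on a source unit would have to be planted. A linear scan is used only for brevity.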

Further referring to Figure 13, the beginning and 
ending of a particular source unit is indicated by the dots 
310 and 312. The matrix positions of the dots 310 and 312 
correspond to the start and end positions 314 and 316 in the 
source unit table 302. The arrows between dots 310 and 312 
and positions 314 and 316 indicate the one-to-one 
relationship of these entities. 

The source unit indexes are the same in both the source 
unit table 302 and the source range table 304. 
Corresponding indexes are indicated by the reference 
numerals 318 and 320. The interconnecting arrow shows the 
direct relationship between these two index entries. 

A selected start PC 322 and end PC 324 are shown in 
source range table 304. Machine instructions 326 and 328 
define a range of instructions in the object module 306. 
The start PC 322 corresponds to the instruction 326 and the 
end PC 324 corresponds to the machine instruction 328. The 
interconnecting arrows indicate the one-to-one relationship 
between the start and end PCs and the machine instructions 
in the object code. 

The operation of the present invention, in certain 
aspects, can be summarized in reference to Figure 13. The 
source code 300 is parsed, as described above, to produce 
the source units which are listed in the source unit table 
302. This produces a set of source units which correspond 
to the entirety of the source code 300. The compilation and 
optimization procedures are then carried out upon the source 
code. In this process, compilation nodes are created and 
these nodes are annotated by use of the source unit indexes. 
Through various compilation and optimization procedures, new 
nodes are created, extended or deleted, as described above. 
In the last step of code generation, the object module is 
produced which consists of the machine language instructions 
corresponding to the defined nodes. The resulting machine 
language instructions are associated with the source unit 
annotations for the nodes which were used to produce the 
machine language instructions. This information is stored 
in the source range table 304. 

It can therefore be seen that between the source code 
300 and the object module 306, there is a direct mapping 
back and forth between the source units and the source code 
and the individual instructions in the object code. For the 
example shown, the source unit 2 is located between the 
starting and ending positions indicated by dots 310 and 312 
in the source code 300. This source unit index is reflected 
in the source unit index in the source range table 304. 
Within the object module, this particular source unit index 
(2) is executed by instructions in the range beginning with 
instruction 326 and ending with instruction 328 in the 
object module 306. The information in the source unit table 
302 and the source range table 304 is utilized by the 
debugger program 30 (Figure 1) to indicate which source unit 
of the source code corresponds to a particular machine 
instruction being executed by the computer which is 
processing the object module 306. 

In practice, optimizations are realized by sequences of 
these node transformations. Often, components of these 
sequences can be optimized away, leaving implicit source 
unit annotations that must be explicitly handled. 

The process of performing operations on sets of source 
units to track source to object code relationships is 
described above. However, a number of classes of 
optimization are particularly applicable to the present 
invention and are further described herein. These relate to 
code motion. This is described in reference to Figures 10-12. 

Code motion is an optimization that refers to the 
process of moving loop invariant computations which are 
inside a loop, to a position outside the loop. When this 
occurs, the moved computations are disassociated from the 
computation of the body of the loop. For compiler loops 
resulting from language-level loop constructs like DO 
statements, there will always be a loop body block source 
unit that is associated with each node in the body of the 
loop. 

The loop-level source unit set will include this source 
unit representing the body of the loop. When a node is 
moved outside the loop, the loop-level source units, 
including the body source unit, are subtracted out of the 
source unit set for that node. The node is then placed in 
another basic block, and the source units of this basic 
block are added to the node's source unit annotations. 

An example of code motion optimization is presented in 
Figures 10, 11 and 12 for a DO loop. Figure 10 is a DO loop 
listing which has boxes to identify the source units 
thereof, as in Figure 3. Figure 11 is a table of PRE-CODE 
MOTION which shows the source unit annotations before code 
motion and Figure 12, POST-CODE MOTION, shows the source 
unit annotations after code motion. Reference is made in 
particular to the rows shown in bold in each of Figures 11 
and 12. The loop of interest is depicted graphically with 
an arrow 280 from the instruction at the end of the loop to 
the first instruction of the loop. The reference to X is 
loop invariant and is hoisted out of the loop. See X in the 
statement in Figure 10. Note that before code motion 
(Figure 11), the reference to X is annotated with source 
unit 6. The source unit 6 is the block source unit for the 
iterative body of the loop. After code motion (Figure 12), 
the reference to X is no longer annotated with source 
unit 6. 

As a component of many optimizations, the compiler 24 
synthesizes a variable to communicate a value from a point 
in one basic block to points in other basic blocks. When a 
variable is synthesized to communicate a value, an 
assignment of the variable must be made in some block and 
uses of the variable will be introduced in other blocks. 
The node representing the assignment is annotated with the 
source units of the node representing the value. That is, 
the assignment is viewed as part of the computation of the 
value. However, the uses that are introduced are not viewed 
as part of the computation of the loop. Instead, they are 
viewed as part of the computation which utilizes the 
variable's value. 

A still further aspect of the present invention 
comprises a method for "visualization" in the debugging 
process. This is described in reference to Figures 14a, 
14b, 15a and 15b. As described above, the present invention 
relates each source unit in the source code to a range of 
instructions in the object code. See the overview 
description in Figure 13. To assist a programmer in the 
debugging operation, the present invention provides display 
correlation of the machine instruction which is either 
to be executed, is in execution, or has just been executed 
with the source units which are designated to have a range 
which includes that particular machine instruction. 
Referring to Figure 14a, there is illustrated a sequence of 
machine instructions comprising a portion of the object 
module. An instruction 350 is noted as being in execution 
by the processor. Instruction 350 is shown in a window 352 
of a display 354. This instruction, and the related 
instructions, are a portion of the object module 
corresponding to the FORTRAN routine "munge" listed in 
Figure 3. In accordance with the present invention, the 
instruction 350 is within the range for the source unit 17e 
(see Figure 3). Referring now to Figure 14b, it can be seen 
that the source unit 17e is the variable x. This source 
unit (x) is highlighted in a window 360 on a display 370. 

The window 352 for the object code, and the window 360, 
for the source code, may be simultaneously displayed on a 
single display screen for the convenience of the programmer. 

A still further illustration of the display aspect of 
the present invention is shown in Figures 15a and 15b. In 
Figure 15a, there is shown a listing of object code with a 
particular instruction 380 which is in execution in a 
processor (computer). The object code listing is in a 
window 382 of a display 384. Referring to Figure 15b, there 
is shown a segment of the routine "munge" with a highlight 
for an entire loop source unit. This is shown in a window 
388 of a display 391. As noted above, the windows 382 and 
388 may be on the same display screen. 

The source code source unit shown in window 388 of 
Figure 15b is mapped, as shown in Figure 13, to a range of 
instructions in the object code which includes instruction 
380 shown in Figure 15a. When the instruction 380 is in 
execution or has just been executed, the 
corresponding source unit in window 388 is highlighted to 
indicate to the programmer the particular source code which 
is being executed. 

The machine language instructions in execution, as 
shown in Figures 14a and 15a, may also be highlighted. 

Figures 14a and 15a further illustrate a listing of 
machine language instructions with corresponding program 
count (PC) numbers. Referring to Figure 14a, the machine 
instructions are in the column which includes the entries 
NEG.W, LE.W, BRS.D and LD.W. At the top of the figure there 
is shown a program count as indicated by the phrase 
pc=(0x80001378). This corresponds to the top instruction in 
the window 352. This is the instruction Q 8 (ap). Each of 
the lower, sequential, machine instructions has a program 
count which is incremented by one unit from that of the 
instruction above it. 

Referring to Figure 15a, there is also illustrated a 
listing of machine instructions in the center column with a 
program count at the line 380. This is for the instruction 
"ADD.W". The program count for this instruction is 
pc=(0x80001382). Each of the instructions above line 380 
will have a program count incremented by one space from that 
for line 380 and those instructions below line 380 will have 
incremented program counts for each step likewise. 

A flow diagram illustrating the operation of the 
present invention is shown in Figure 16. This illustrates 
the sequential steps carried out in conjunction with the 
compilation process to generate the tables used for 
providing source to object code correlation. The source 
file 22 is provided to operational block 390 in which 
lexical analysis of the source file is carried out to 
generate tokens. Next, in operational block 392, the source 
file is parsed to generate a parse tree. The operations 
carried out in blocks 390 and 392 are conventional in 
compiler technology. 

Following block 392, an operation is carried out in 
block 394 to process the parse tree and generate source 
units. The source units have been described in detail above 
and the methods of generating the source units have also 
been described. An example of a routine with source units 
is shown in Figure 3. 

After the source units are generated in block 394, 
operation is transferred to a block 396 in which the source 
units are provided to the debugger program 30. This is in 
the form of the source unit table 28 described above. The 
table 28 is preferably a file provided to the debugger 
program 30. 

The source code is processed at the beginning stage in 
the block 398 in which there are generated initial node 
representations for each source file. Further, each node is 
annotated with the corresponding source units. Continuing 
further in the compilation process, in block 400, the source 
file code is optimized with various node operations, 
described above, including creations, merging, deletions and 
so forth. Each new node is annotated with the related 
source units. In a question block 402, following block 400, 
a determination is made whether additional optimization 
steps should be performed. If the optimization is not 
complete, control is transferred through a line 404 to 
implement additional optimizing processes in block 400. 
After optimization has been completed, control is 
transferred from question block 402 to operational block 406 
for code generation. The nodes produced as a result of the 
compilation and optimization are used to produce machine 
language instructions and each of these instructions is 
annotated with the source units for the node which produced 
the instruction. 

Following block 406, entry is made to operational block 
408 to generate a source range table, such as table 42, from 
the annotated machine instructions and this table is 
provided to the debugger program 30. Finally, in block 410, 
the binary machine language instructions are produced for 
the executable file. The process is completed at an end 
step 412. 

Referring now to Figure 17 there is illustrated the 
operation of the present invention for utilizing the source 
unit tables and source range tables described above. The 
executable file 46 is used by an operational block 430 
wherein the file is provided to a processor for execution 
for debugging of the program being executed. The debugging 
operation provides sequential execution of the machine 
language instructions in which the programmer monitors the 
operation of the program to locate and resolve problems. 

In operational block 432, a sequence of the machine 
language instructions is displayed, and this includes at 
least the particular instruction for execution in the 
processor. See Figure 14a. 

In operational block 434, the executed instruction is 
examined to determine the program count (PC) for that 
instruction. This is related to the source range table to 
determine the program count ranges in which the executed 
instruction is included. This identifies the corresponding 
source unit indexes. Continuing in operational block 436, 
the identified source unit indexes are related to the 
corresponding indexes of the source unit table which in turn 
identifies the start and end positions of the corresponding 
source units in the source unit table. Continuing to 
operational block 438, the corresponding source units are 
displayed as shown in Figures 14b and 15b to highlight the 
source units that correspond to the machine instruction 
which was executed. There may be one or more highlighted 
source units which relate to a particular machine 
instruction. 
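
The lookup performed in blocks 434 through 438 can be sketched as follows. This is a minimal illustration under assumptions: the function names `units_for_pc` and `source_text_for_pc`, the dictionary layouts of both tables, the character-offset positions and the program count values are hypothetical and are not taken from the tables of the specification.

```python
def units_for_pc(range_table, pc):
    """Return the source unit indexes whose ranges contain the program count."""
    return sorted(unit for unit, ranges in range_table.items()
                  if any(start <= pc < end for start, end in ranges))


def source_text_for_pc(source, unit_table, range_table, pc):
    """Return (index, text) pairs for the source units to be highlighted."""
    result = []
    for unit in units_for_pc(range_table, pc):
        start, end = unit_table[unit]  # start and end positions in the source
        result.append((unit, source[start:end]))
    return result


source = "a(i) = i * 3 + x"

# Hypothetical source unit table: index -> (start, end) character offsets.
unit_table = {7: (0, 16), 9: (7, 16), 10: (7, 12)}

# Hypothetical source range table: index -> half-open program count ranges.
range_table = {7: [(0x10, 0x20)], 9: [(0x10, 0x1C)], 10: [(0x10, 0x14)]}

hits = source_text_for_pc(source, unit_table, range_table, 0x10)
```

For the program count `0x10` all three nested units are returned, which corresponds to the case noted above in which more than one source unit is highlighted for a single machine instruction.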

In summary, the present invention is a method for use 
with a debugger program to correlate units of source code 
with instructions in object code in a compiler environment 
wherein there may have been one or more steps of 
optimization that can produce a complex relationship between 
the source code and object code. This is carried out by 
processing the source code to define source units in the 
code, producing compiler nodes which are annotated with the 
source units and then generating instructions that are 
defined within ranges for each source unit. Further, when 
the machine instructions are in the process of execution, 
the present invention can display the particular source 
units and corresponding machine instructions being executed 
by the processor. 

Although only one embodiment of the invention has 
been illustrated in the accompanying drawings and 
described in the foregoing Detailed Description, it will 
be understood that the invention is not limited to the 
embodiment disclosed, but is capable of numerous 
rearrangements, modifications and substitutions without 
departing from the scope of the invention. 

Claims 

What we claim is: 

1. A method for establishing a relationship between 
machine language instructions in object code with elements 
of source code which was compiled to produce the object 
code, the method comprising the steps of: 

processing said source code to produce therefrom a 
plurality of source units which reflect the syntax of said 
source code wherein each source unit corresponds to an 
element or set of related elements of said source code, 

processing said source code, in conjunction with said 
compilation, to produce compiler nodes, 

generating an annotation for each of said compiler 
nodes to identify the ones of said source units related to 
each of the compiler nodes, 

producing said machine language instructions by use 
of said compiler nodes, and 

defining ranges of said machine language instructions 
to establish through said annotations a correspondence 
between the machine language instructions in each of said 
ranges and ones of said source units which are related to 
the machine language instructions in each of said ranges. 

2. A method for establishing a relationship between 
machine language instructions in object code with the 
elements of the source code as recited in Claim 1 wherein 
the steps of processing said source code to produce 
compiler nodes and generating an annotation includes the 
steps of: 

logically stacking related ones of said source units 
wherein each source unit in said stack is immediately 
above its immediate parent source unit, and 

generating one of said compiler nodes for each source 
unit in said stack and annotating the generated compiler 
node with indexes for that source unit and each of the 
source units below the source unit in the stack for which 
the compiler node is generated. 

3. A method for establishing a relationship between 
machine language instructions in object code with the 
elements of the source code as recited in Claim 1 wherein 
said step of processing said source code to produce 
compiler nodes includes generating computation nodes which 
reflect computations in said source code and generating 
entry nodes which reflect the flow of said source code. 

4. A method for establishing a relationship between 
machine language instructions in object code with elements 
of corresponding source code, the method comprising the 
steps of: 

processing said source code to produce therefrom a 
plurality of source units which reflect the syntax of said 
source code wherein each source unit corresponds to an 
element or set of related elements of said source code, 

processing said source code with optimizing processes 
to produce compiler nodes, 

generating an annotation for each of said compiler 
nodes to identify the ones of said source units related to 
each of the compiler nodes, 

producing said machine language instructions from 
said compiler nodes and annotating each said produced 
machine instruction with the source units annotated for 
the corresponding compiler node, and 

defining ranges of said machine language instructions 
to establish through said annotations a correspondence 
between the machine language instructions in each of said 
ranges and ones of said source units which are related to 
the machine language instructions in each of said ranges. 

5. A method for establishing a relationship between 
machine language instructions in object code with the 
elements of the source code as recited in Claim 4 wherein 
the step of generating an annotation and processing said 
source code to produce compiler nodes includes the steps 
of: 

logically stacking related ones of said source units 
wherein each source unit in said stack is immediately 
above its immediate parent source unit, and 

generating one of said compiler nodes for each source 
unit in said stack and annotating the generated compiler 
node with indexes for that source unit and each of the 
source units below the source unit in the stack for which 
the compiler node is generated. 

6. In a debugger program for use with object code 
produced by optimized compiling of source code wherein the 
debugger program provides correlation between machine 
instructions of the object code and elements of the source 
code, the improvement comprising: 

a source unit table comprising a plurality of source 
units each of which includes a unique index, a position 
identification in said source code, a context 
identification of said source unit in said source code, a 
linkage to other source units, and a class identification 
of the source unit, and 

a source range table which specifies one or more 
ranges of said machine instructions which are associated 
with each of said source units, wherein each element of 
said source code can be related to corresponding ones of 
said machine instructions through said source units and 
said source range table. 

7. A method for establishing correlation between 
object code and corresponding source code, comprising the 
steps of: 

processing said source code to produce therefrom a 
plurality of source units each of which includes a unique 
index, a position identification in said source code, a 
context identification of said source unit in said source 
code, a linkage to other source units, and a class 
identification of the source unit, 

processing said source code to produce compiler 
nodes, which include entry nodes and computation nodes, 

generating an annotation for each of said compiler 
nodes to identify the ones of said source units related to 
each of the compiler nodes, 

producing said machine language instructions by use 
of said compiler nodes, and 

generating a source range table which specifies one 
or more ranges of said machine instructions which are 
associated with each of said source units, wherein each 
element of said source code can be related to 
corresponding ones of said machine instructions through 
said source units and said source range table. 

8. A method for use in a debugger program for 
indicating a unit of source code which corresponds to a 
selected object code machine language instruction in a 
computer, wherein the object code was produced from the 
source code through compilation, the method comprising the 
steps of: 

processing said source code to produce therefrom a 
plurality of source units which reflect the syntax of said 
source code wherein each source unit corresponds to an 
element or set of related elements of said source code, 

processing said source code, in conjunction with said 
compilation process, to produce computation nodes which 
correspond to computations in said source code, 

generating an annotation for each of said computation 
nodes to identify the ones of said source units related to 
each of the compiler nodes, 

producing said machine language instructions by use 
of said compiler nodes, 

establishing ranges for said machine language 
instructions corresponding to said compiler nodes to 
thereby establish through said annotations a 
correspondence between the machine language instructions 
in said object code and said source units in said source 
code, 

determining by use of said correspondence the ones of 
said source units which correspond to said selected 
machine language instruction in said computer, and 

displaying a sequence of said source units of said 
source code on a display and highlighting one or more of 
said ones of said source units which correspond to said 
selected machine language instruction. 

9. A method for establishing a relationship between 
machine language instructions in object code with the 
elements of the source code as recited in Claim 8 wherein 
the steps of processing said source code to produce 
compiler nodes and generating an annotation includes the 
steps of: 

logically stacking related ones of said source units 
wherein each source unit in said stack is immediately 
above its immediate parent source unit, and 

generating one of said compiler nodes for each source 
unit in said stack and annotating the generated compiler 
node with indexes for that source unit and each of the 
source units below the source unit in the stack for which 
the compiler node is generated. 

10. A method for establishing a relationship between 
machine language instructions in object code with the 
elements of the source code as recited in Claim 8 wherein 
said step of processing said source code to produce 
compiler nodes includes generating computation nodes which 
reflect computations in said source code and generating 
entry nodes which reflect the flow of said source code. 






WO 93/25963 



PCT/US93/05227 






[FIG. 2: flow diagram of the compiler stages, with blocks 
labeled COMPILER FRONT END, BASIC BLOCK FORMATION, LOOP 
FORMATION, LOOP OPTIMIZATION, CODE GENERATION and EMIT, 
together with several blocks labeled OPTIMIZATION. Drawing 
not reproduced.] 

[FIG. 3: source listing of the example subroutine munge, 
marked with source unit labels. Listing not recoverable from 
the scan.] 






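[FIG. 6: stack of source units, listed from the top of the 
stack down; drawing labels: ENTRY NODES, STACK GROWTH; 
reference numeral 180.] 

10e   i * 3 
9e    i * 3 + x 
7s    a(i) = i * 3 + x 
6b    a(i) = i * 3 + x ... 
2l    do i = 1 ... enddo 
1b    do i = 1 ... enddo ... 
0r    subroutine munge 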






[FIG. 7: node diagram showing a CONVERT node and the source 
unit index set {0,1,2,6,14,15,17}; reference numerals 200 
and 214. Drawing not reproduced.] 






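[FIG. 8: flow diagram with blocks ROUTINE ENTRY, LOOP SETUP, 
LOOP BODY and END, and source unit index sets {0,1,2}, 
{0,1,2,6} and {0,1,18}; reference numerals 240 to 260. 
Drawing not reproduced.] 

[FIG. 9: loop diagram with blocks LOOP PREHEADER and LOOP 
HEADER, and annotation labels {LOOP LEVEL SOURCE UNITS}, 
{PREHEADER SOURCE UNITS}, {HEADER SOURCE UNITS} and {TAIL 
SOURCE UNITS}; reference numerals 262 and 264. Drawing not 
reproduced.] 

[FIG. 10: source fragment; the boxes that mark individual 
source units in the drawing are omitted.] 

DO I = 1,N 
A(I) = D * (X + Y) 
ENDDO 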






[FIG. 11: PRE-CODE MOTION. Table with the columns SOURCE 
UNIT ANNOTATIONS (one column per source unit index, 00 to 
18), NODE IDENTIFIER, and NODE TYPE AND INPUT & OUTPUT ARCS. 
Node types that remain legible include CN_ENTRY, CN_GOTO, 
CN_CONVERT, CN_LT, CN_IF and CN_RETURN. The individual rows 
are not recoverable from the scan.] 






[FIG. 12: POST-CODE MOTION. Table with the columns SOURCE 
UNIT ANNOTATIONS (one column per source unit index, 00 to 
18), NODE IDENTIFIER, and NODE TYPE AND INPUT & OUTPUT ARCS. 
Node types that remain legible include CN_ENTRY, CN_GOTO, 
CN_CONVERT, CN_LT, CN_IF and CN_RETURN. The individual rows 
are not recoverable from the scan.] 



[Drawing sheets 8/14 to 12/14: figures not recoverable from 
the scan.] 



[FIG. 16: flow diagram of compilation, starting from the 
source file. Reference numerals legible in the scan: 22, 28, 
42, 390, 392, 394, 396, 398, 404, 406, 408, 410, 412. Block 
text, in order:] 

LEXICAL ANALYSIS OF SOURCE FILE TO GENERATE TOKENS 

PARSE SOURCE FILE TO GENERATE PARSE TREE 

PROCESS PARSE TREE TO GENERATE SOURCE UNITS 

PROVIDE SOURCE UNITS TO DEBUGGER 

GENERATE INITIAL NODE REPRESENTATION OF SOURCE FILE AND 
ANNOTATE EACH NODE WITH CORRESPONDING SOURCE UNITS 

OPTIMIZE SOURCE FILE CODE WITH NODE CREATIONS, MERGING, 
DELETION, ETC. AND ANNOTATE EACH NEW NODE WITH ALL RELATED 
SOURCE UNITS 

PERFORM CODE GENERATION WITH ANNOTATED NODES TO PRODUCE 
MACHINE INSTRUCTIONS AND ANNOTATE EACH INSTRUCTION WITH 
SOURCE UNITS OF NODE WHICH PRODUCED THE INSTRUCTION 

GENERATE A SOURCE RANGE TABLE (SRT) FROM ANNOTATED MACHINE 
INSTRUCTIONS AND PROVIDE SRT TO DEBUGGER 

GENERATE BINARY MACHINE LANGUAGE INSTRUCTIONS FOR 
EXECUTABLE FILE 

END 






[FIG. 17: flow diagram of the debugging operation, starting 
from the executable file. Reference numerals legible in the 
scan: 430, 432, 434, 436, 438, 440. Block text, in order:] 

EXECUTE INSTRUCTION IN THE DEBUGGED PROGRAM OF THE 
EXECUTABLE FILE 

DISPLAY A SEQUENCE OF SAID MACHINE LANGUAGE INSTRUCTIONS 
INCLUDING SAID EXECUTED INSTRUCTIONS 

RELATE THE EXECUTED INSTRUCTION PC TO THE CORRESPONDING PC 
RANGE IN THE SRT TO IDENTIFY THE CORRESPONDING SOURCE UNIT 
INDEXES 

RELATE THE SOURCE UNIT INDEXES TO THE CORRESPONDING INDEXES 
OF THE SOURCE UNIT TABLE AND IDENTIFY THE START AND END 
POSITIONS OF THE CORRESPONDING SOURCE UNITS 

DISPLAY THE CORRESPONDING SOURCE UNITS 






INTERNATIONAL SEARCH REPORT 

International Application No: PCT/US 93/05227 

I. CLASSIFICATION OF SUBJECT MATTER 

According to International Patent Classification (IPC) or to 
both National Classification and IPC: 
Int. Cl. 5  G06F11/00 

II. FIELDS SEARCHED 

Minimum documentation searched: 
Classification system: Int. Cl. 5 
Classification symbols: G06F 

III. DOCUMENTS CONSIDERED TO BE RELEVANT 

EP, A, 0 317 080 (HEWLETT-PACKARD COMPANY), 24 May 1989; 
see claims. Relevant to claims 1-10. 

COMPUTER DESIGN, vol. 27, no. 13, July 1988, Littleton, 
Massachusetts, US, pages 48 - 55; HOWARD FALK, 'Optimizing 
compilers address debugging and user control constraints'; 
see page 55, left column, line 29 - right column, line 13. 
Relevant to claims 1-10. 

ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 
vol. 7, no. 1, January 1985, pages 176 - 181; DAVID WALL ET 
AL., 'A Note on Hennessy's "Symbolic Debugging of Optimized 
Code"'. Relevant to claims 1-10. 

IV. CERTIFICATION 

Date of the actual completion of the international search: 
22 September 1993 
Date of mailing of this international search report: 
29.09.93 
International Searching Authority: European Patent Office 
Signature of authorized officer: Guido Corremans 

Form PCT/ISA/210 

ANNEX TO THE INTERNATIONAL SEARCH REPORT 
ON INTERNATIONAL PATENT APPLICATION NO. US 9305227 
SA 75661 

This annex lists the patent family members relating to the 
patent documents cited in the above-mentioned international 
search report. The members are as contained in the European 
Patent Office EDP file on 22/09/93. The European Patent 
Office is in no way liable for these particulars which are 
merely given for the purpose of information. 

Patent document          Publication   Patent family    Publication 
cited in search report   date          member(s)        date 

EP-A-0317080             24-05-89      US-A- 4953084    28-08-90 
                                       CA-A- 1293810    31-12-91 
                                       JP-A- 1166141    30-06-89 

For more details about this annex: see Official Journal of 
the European Patent Office, No. 12/82 